NetNews Offline 2


Text File | 1996-08-05 | 1.8 KB | 38 lines

Newsgroups: comp.lang.c,comp.lang.perl
Path: news.inap.net!news1!ind-004-236-188
From: dlmiller@iquest.net (Doug Miller)
Subject: Re: Random File Access - I don't get it
X-Nntp-Posting-Host: ind-004-236-188.iquest.net
Message-ID: <DpAIH0.2yC@iquest.net>
Sender: news@iquest.net (News Admin)
Organization: IQuest Network Services
X-Newsreader: News Xpress Version 1.0 Beta #2.1
References: <3160DE1E.495C@teleport.com>
Date: Wed, 3 Apr 1996 14:22:08 GMT

Scott Kinard <kinards@teleport.com> wrote:
>Greetings,
>
> As I was coding a simple program to randomly read lines from a text file it occurred
>to me something was amiss. This is probably some fundamental oversight on my part, but
>some illumination would be helpful...
>
> Suppose I have two files, one contains text (1 line of random length text per
>'record') and another which is my index into the text file, which contains the starting
>and ending byte positions in the text file for each 'record'. Now the question..
>
>Suppose the text file has 2,000,000 entries. Now, what's the difference in reading
>1,000,000 lines from the index file to find the starting and ending byte positions in
>the text file for record 1,000,000, and then seeking these in the text file, rather than
>just reading 1,000,000 times from the text file to get the same 'record' in the first
>place?
>
>-Scott

The *index* file would have fixed-length records. Say for example that the start
and end positions are each 2-byte ints; then each record in the index file would
be 4 bytes long. The 1,000,000th record in the index file would thus be located
at an offset of (1M - 1) * 4 = 3,999,996 bytes. lseek to that offset in the index
file and read *one* index record to retrieve the starting and ending addresses
in the data file. Compute the length, lseek to the starting address, and read
*one* record for the computed length. Two reads.
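
For anyone who wants to try this, here is a rough sketch of the two-read lookup in C on a POSIX system. It is only one way to do it, under some assumptions of mine that are not in the post above: the names idx_rec, build_index and fetch_record are made up for illustration; the offsets are 32-bit rather than 2-byte, since a 2-byte unsigned int tops out at 65,535 and could not address a data file of 2,000,000 lines; the end offset is stored as one past the last byte of the line, with the newline excluded; and the index is written in the machine's native byte order, so it is not portable between machines. With 8-byte index records, record n sits at offset (n - 1) * 8 instead of (n - 1) * 4.

```c
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical fixed-length index record (8 bytes): byte offsets of one
   line in the data file. "end" is one past the last byte of the line. */
struct idx_rec {
    uint32_t start;
    uint32_t end;
};

/* Build the index with a single sequential pass over the data file.
   Returns the number of index records written, or -1 on error.
   Assumes the data file is smaller than 4 GB. */
long build_index(const char *data_path, const char *idx_path)
{
    FILE *in = fopen(data_path, "rb");
    FILE *out = fopen(idx_path, "wb");
    long count = 0;
    uint32_t pos = 0, start = 0;
    int c;

    if (in == NULL || out == NULL) {
        if (in != NULL) fclose(in);
        if (out != NULL) fclose(out);
        return -1;
    }
    while ((c = getc(in)) != EOF) {
        if (c == '\n') {
            struct idx_rec r = { start, pos };
            if (fwrite(&r, sizeof r, 1, out) != 1) { count = -1; break; }
            count++;
            start = pos + 1;
        }
        pos++;
    }
    /* Index a final line that has no trailing newline. */
    if (count >= 0 && pos > start) {
        struct idx_rec r = { start, pos };
        if (fwrite(&r, sizeof r, 1, out) == 1) count++;
        else count = -1;
    }
    fclose(in);
    if (fclose(out) != 0) count = -1;
    return count;
}

/* Fetch record n (1-based) into buf as a NUL-terminated string.
   One lseek + read on the index, one lseek + read on the data.
   Returns the record length, or -1 on error or if n is out of range. */
long fetch_record(int idx_fd, int data_fd, long n, char *buf, size_t bufsize)
{
    struct idx_rec r;
    off_t idx_off;
    size_t len;

    if (n < 1) return -1;

    /* Fixed-length records: record n starts at (n - 1) * sizeof r. */
    idx_off = (off_t)(n - 1) * (off_t)sizeof r;
    if (lseek(idx_fd, idx_off, SEEK_SET) == (off_t)-1) return -1;
    if (read(idx_fd, &r, sizeof r) != (ssize_t)sizeof r) return -1;

    if (r.end < r.start) return -1;
    len = (size_t)(r.end - r.start);
    if (len + 1 > bufsize) return -1;

    if (lseek(data_fd, (off_t)r.start, SEEK_SET) == (off_t)-1) return -1;
    if (read(data_fd, buf, len) != (ssize_t)len) return -1;
    buf[len] = '\0';
    return (long)len;
}
```

To use it, call build_index once (that pass is the only time the whole data file gets read), then open both files with open(path, O_RDONLY) and call fetch_record with whatever record number you want. The cost of a lookup is the same for record 1,000,000 as for record 1.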